Fudan at TRECVID 2015: Adaptive Feature Fusion for Multimedia Event Detection in Videos
Abstract
TRECVID 2015 [4] Multimedia Event Detection (MED) is an interesting and challenging task: detecting high-level complex events in Internet videos [1]. In this notebook paper, we present an overview of our system, focusing on combining multiple feature representations to improve performance. Specifically, given the outputs of the multiple features, we adopt a simple yet effective fusion method to generate the final predictions, where the optimal fusion weights are learned adaptively for each class and the learning process is regularized by automatically estimated class relationships. Our MED submissions include 5 system runs for the Pre-Specified (PS) sub-task and 2 runs for the AdHoc (AH) sub-task under the 010Ex training condition. Very competitive results are obtained, which verify the effectiveness of both the deep features and our fusion method.

Table 1: A summary of our submissions.

Task  Run    Features                                               Fusion
AH    Run-1  VGG19-fc67, VGG19-20K, FCVID-233, Conventional Fea.    Adaptive
AH    Run-2  VGG19-20K, FCVID-233, Conventional Fea.                Adaptive
PS    Run-1  VGG19-fc67, VGG19-20K, FCVID-233, Conventional Fea.    Adaptive Param-1
PS    Run-2  VGG19-fc67, VGG19-20K, FCVID-233, Conventional Fea.    Average
PS    Run-3  VGG19-fc67, VGG19-20K, FCVID-233, Conventional Fea.    Adaptive Param-2
PS    Run-4  VGG19-20K, FCVID-233, Conventional Fea.                Adaptive
PS    Run-5  VGG19-fc67, VGG19-20K, Conventional Fea.               Adaptive

1. SYSTEM DESCRIPTION

Figure 1 gives an overview of our system, which consists of three components: feature extraction, classification, and adaptive fusion. We briefly introduce each of them in the following.
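To make the distinction between the "Average" and "Adaptive" fusion entries in Table 1 more concrete, the following is a minimal sketch of late fusion over per-feature classifier scores. It is not the authors' implementation: the function names, the score layout, and the example weights are all hypothetical, the weights are fixed by hand rather than learned, and the class-relationship regularizer described in the abstract is not modeled. It only shows how a separate weight per feature and per class changes the fused score compared with plain averaging.

```python
from typing import Dict, List

# Hypothetical layout: feature name -> score matrix indexed as [video][event class].
ScoreTable = Dict[str, List[List[float]]]
# Hypothetical layout: feature name -> one non-negative weight per event class.
WeightTable = Dict[str, List[float]]


def average_fusion(scores: ScoreTable) -> List[List[float]]:
    """Fuse per-feature scores with equal weight for every feature and class."""
    names = list(scores)
    n_videos = len(scores[names[0]])
    n_classes = len(scores[names[0]][0])
    return [
        [sum(scores[f][v][c] for f in names) / len(names) for c in range(n_classes)]
        for v in range(n_videos)
    ]


def per_class_weighted_fusion(scores: ScoreTable, weights: WeightTable) -> List[List[float]]:
    """Fuse per-feature scores with a separate weight per (feature, class).

    Weights are normalized per class so the fused score stays on the same
    scale as the inputs. Here the weights are given, not learned.
    """
    names = list(scores)
    n_videos = len(scores[names[0]])
    n_classes = len(scores[names[0]][0])
    fused = []
    for v in range(n_videos):
        row = []
        for c in range(n_classes):
            total = sum(weights[f][c] for f in names)
            row.append(sum(weights[f][c] * scores[f][v][c] for f in names) / total)
        fused.append(row)
    return fused


# Toy example: two features, two videos, two event classes (made-up numbers).
example_scores: ScoreTable = {
    "deep": [[0.9, 0.2], [0.1, 0.8]],
    "conventional": [[0.5, 0.4], [0.3, 0.2]],
}
# Class 0 trusts the "deep" feature three times as much; class 1 weights both equally.
example_weights: WeightTable = {"deep": [3.0, 1.0], "conventional": [1.0, 1.0]}

avg = average_fusion(example_scores)
adaptive = per_class_weighted_fusion(example_scores, example_weights)
```

With equal weights for a class, the weighted variant reduces to the plain average for that class, so in this toy setup the two schemes differ only where the per-class weights differ.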
Similar resources
NTTFudan Team at TRECVID 2016: Multimedia Event Detection
The TRECVID 2016 Multimedia Event Detection (MED) challenge evaluates the detection performance of high-level complex events in Internet videos with a limited number of positive training examples [1]. In this notebook paper, we present an overview of our system, highlighting the selection and fusion of multiple classification models from a wide range of feature representations to improve the ...
Multimedia Event Detection – Strong by Multi-Modality Integration
We will present our Multimedia Event Detection system with positive video exemplars (event query is defined by 10 or 100 positive videos), which achieves state-of-the-art performance by designing different fusion strategies for different modalities. First, in visual system, the standard fusion strategy is averaging probability scores obtained by different features. This strategy could achieve r...
NEU MITLL @ TRECVid 2015: Multimedia Event Detection by Pre-trained CNN Models
We introduce a framework for multimedia event detection (MED), which was developed for TRECVID 2015 using convolutional neural networks (CNNs) to detect complex events via deterministic models trained on video frame data. We used several well-known CNN models designed to detect objects, scenes, and a combination of both (i.e., Hybrid-CNN). We also experimented with features from different netwo...
Complex Event Detection via Event Oriented Dictionary Learning
Complex event detection is a retrieval task with the goal of finding videos of a particular event in a large-scale unconstrained Internet video archive, given example videos and text descriptions. Nowadays, different multimodal fusion schemes of low-level and high-level features are extensively investigated and evaluated for the complex event detection task. However, how to effectively select th...
ITI-CERTH participation to TRECVID 2012
This paper provides an overview of the tasks submitted to TRECVID 2012 by ITI-CERTH. ITI-CERTH participated in the Known-item search (KIS), in the Semantic Indexing (SIN), as well as in the Event Detection in Internet Multimedia (MED) and the Multimedia Event Recounting (MER) tasks. In the SIN task, techniques are developed, which combine video representations that express motion semantics with ...